This webpage is for an old version of the course; content may be out of date!

CSE 258: Web Mining and Recommender Systems

Instructor: Julian McAuley (jmcauley@eng.ucsd.edu), CSE 4102

Winter 2017, Monday/Wednesday 18:30-19:50, Peterson Hall 108



"All that befalls you is part of the great Web."
-Marcus Aurelius


CSE 258 is a graduate course devoted to current methods for recommender systems, data mining, and predictive analytics. No previous background in machine learning is required, but all participants should be comfortable with programming (all example code will be in Python), and with basic optimization and linear algebra.

The course meets twice a week on Monday/Wednesday evenings, starting January 9. Meetings are in Peterson Hall 108.

There is no textbook for the course, though chapter references will be provided from Pattern Recognition and Machine Learning (Bishop), and from Charles Elkan's 2013 course notes.

Office hours: I'll hold office hours on Tuesdays 9:00-13:00 in CSE 4102. The course TAs will hold office hours on Mondays and Fridays 10:00-13:00 in B250A. For other discussions see the course's Piazza page.

Note that there will be no class on Jan 16 (MLK Day) or Feb 20 (Presidents' Day).

Part 1: Methods

Week | Topics | Files | References | Slides | Podcast | Homework
1 (Jan 9/Jan 11) Supervised Learning: Regression
  • Least-squares regression
  • Overfitting & regularization
  • Training, validation, and testing
50k beer reviews
non-alcoholic beer reviews
week1.py
Bishop ch.3
Elkan ch.3,6
introduction & outline
lecture 1 (w/ annotations)
lecture 2 (w/ annotations)
lecture 1
lecture 2
Homework 1
due Jan 23
2/3 (Jan 18/23) Supervised Learning: Classification
  • Logistic regression
  • SVMs
  • Multiclass & multilabel classification
  • How to evaluate classifiers
50k book descriptions
5k book cover images
week2.py
Bishop ch.4
Elkan ch.5,8
lecture 3 (w/ annotations)
lecture 4 (w/ annotations)
lecture 3
lecture 4
Homework 2
due Feb 6
3/4 (Jan 25/30) Dimensionality Reduction & Clustering
  • Singular value decomposition & PCA
  • K-means & hierarchical clustering
  • Community detection
facebook ego network
week3.py
assignment 1 data
Bishop ch.9
Elkan ch.13
lecture 5 (w/ annotations)
lecture 6 (w/ annotations)
lecture 5
lecture 6
Assignment 1
due Feb 27

Part 2: Applications

Week | Topics | Files | References | Slides | Podcast | Homework
4/5 (Feb 1/6) Recommender Systems
  • Latent-factor models
  • Collaborative filtering
Elkan ch.11
lecture 7 (w/ annotations)
lecture 8 (w/ annotations)
assignment 1
lecture 7
lecture 8
Homework 3
due Feb 20
5/6 (Feb 8/13) Text Mining
  • Sentiment analysis
  • Bags-of-words
  • TF-IDF
  • Stopwords, stemming, and topic models
week5.py
Elkan ch.12
lecture 9 (w/ annotations)
lecture 10 (midterm review) (w/ annotations)
lecture 9
lecture 10
6 (Feb 15) MIDTERM
fa15 midterm (CSE255)
fa15 midterm (CSE190)
sp15 midterm (CSE190)
week6.py
Assignment 2
due Mar 13
7 (Feb 22) Text Mining ctd.
lecture 11 (w/ annotations)
assignment 2
lecture 11
Homework 4
due Mar 6
8 (Feb 27/Mar 1) Network Analysis
  • Power-laws and small-worlds
  • Random graph models
  • Triads and weak ties
  • HITS and PageRank
Elkan ch.14
Easley & Kleinberg
lecture 12 (w/ annotations)
lecture 13 (w/ annotations)
lecture 12
lecture 13
9 (Mar 6/8) Online advertising
  • Matching & marriage problems
  • AdWords
  • Bandit algorithms
tensorflow.py
Mining Massive Datasets
lecture 14 (w/ annotations)
lecture 15 (w/ annotations)
lecture 14
lecture 15
10 (Mar 13/15) Modeling Temporal and Sequence Data
  • Sliding windows and autoregression
  • Temporal dynamics in recommender systems
  • Temporal dynamics in text and social networks
week10.py
lecture 16 (w/ annotations)
lecture 17 (w/ annotations)
lecture 16
lecture 17